Unsupervised multimodal processing

Authors

Abel Nyamapfene

Khurshid Ahmad

Abstract

We present two separate algorithms for unsupervised multimodal processing. Our first proposal, the single-pass Hebbian-linked self-organising map network, significantly shortens the training of Hebbian-linked self-organising maps by computing, in a single epoch, the weights of the links that associate the separate modal maps. Our second proposal, based on the counterpropagation network algorithm, implements multimodal processing on a single self-organising map, thereby eliminating the network complexity associated with Hebbian-linked self-organising maps. When assessed on two bimodal datasets, an audio-acoustic speech utterance dataset and a phonological-semantics child utterance dataset, both approaches achieve shorter computation times and lower crossmodal mean squared errors than traditional Hebbian-linked self-organising maps. In addition, the modified counterpropagation network achieves higher crossmodal classification percentages than either of the two Hebbian-linked self-organising map approaches.
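The idea of computing crossmodal link weights in one epoch can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the update rule, so the count-based co-activation rule below, the function names (`bmu`, `hebbian_links_single_pass`, `recall_b_from_a`), and the toy codebooks are all assumptions made for illustration. The sketch assumes two modal maps that have already been trained separately, then makes a single pass over paired samples and strengthens the link between the best-matching units of each pair.

```python
def bmu(codebook, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((w - v) ** 2 for w, v in zip(codebook[k], x)),
    )


def hebbian_links_single_pass(codebook_a, codebook_b, pairs):
    """Assumed rule: one pass over paired samples, adding 1 to the link
    between the two units that win for the same pair."""
    links = [[0.0] * len(codebook_b) for _ in codebook_a]
    for xa, xb in pairs:
        links[bmu(codebook_a, xa)][bmu(codebook_b, xb)] += 1.0
    return links


def recall_b_from_a(codebook_a, codebook_b, links, xa):
    """Crossmodal recall: follow the strongest link from the winning unit in
    map A and return the linked prototype from map B."""
    row = links[bmu(codebook_a, xa)]
    j = max(range(len(row)), key=row.__getitem__)
    return codebook_b[j]


# Hypothetical toy data: two pre-trained maps with two units each.
codebook_a = [[0.0, 0.0], [1.0, 1.0]]   # modality A prototypes
codebook_b = [[10.0], [20.0]]           # modality B prototypes
pairs = [
    ([0.1, 0.0], [19.0]),
    ([0.9, 1.0], [11.0]),
    ([0.0, 0.2], [21.0]),
]

links = hebbian_links_single_pass(codebook_a, codebook_b, pairs)
recalled = recall_b_from_a(codebook_a, codebook_b, links, [0.05, 0.05])
```

In this toy setting each link weight is simply a co-activation count, so the full link matrix is available after one pass over the pairs rather than after repeated epochs of incremental updates.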


Similar resources

Apprentissage spatial de corrélations multimodales par des mécanismes d'inspiration corticale. (Spatial learning of multimodal correlations in a cortically inspired way)

This thesis focuses on unifying multiple modal data flows that may be provided by the sensors of an agent. This unification, inspired by psychological experiments such as the ventriloquist effect, is based on detecting correlations, which are defined as temporally recurrent spatial patterns that appear in the input flows. Learning of the input flow correlations space consists of sampling this ...


Unsupervised Visual Sense Disambiguation for Verbs using Multimodal Embeddings

We introduce a new task, visual sense disambiguation for verbs: given an image and a verb, assign the correct sense of the verb, i.e., the one that describes the action depicted in the image. Just as textual word sense disambiguation is useful for a wide range of NLP tasks, visual sense disambiguation can be useful for multimodal tasks such as image retrieval, image description, and text illust...


Neural Gas for Sequences

For unsupervised sequence processing, standard self-organizing maps (SOM) can be naturally extended by recurrent connections and explicit context representations. Known models are the temporal Kohonen map (TKM), recursive SOM, SOM for structured data (SOMSD), and HSOM for sequences (HSOM-S). We discuss and compare the capabilities of exemplary approaches to store different types of sequences. A...
